Abstract

The goal of the current research was to better understand when and why feedback has positive 
effects on learning and to identify features of feedback that may improve its efficacy. In a 
randomized experiment, second-grade children received instruction on a correct problem-solving 
strategy and then solved a set of relevant problems. Children were assigned to receive no 
feedback, immediate feedback, or summative feedback from the computer. On a posttest the 
following day, feedback resulted in higher scores relative to no feedback for children who started 
with low prior knowledge. Immediate feedback was particularly effective, facilitating mastery of 
the material for children with both low and high prior knowledge. Results suggest that minimal 
computer-generated feedback can be a powerful form of guidance during problem solving. 

Key words: feedback, problem solving, computer learning, mathematics learning 
The benefits of computer-generated feedback for mathematics problem solving 

Feedback often has powerful, positive effects for children across development. For 
example, feedback has been shown to improve performance for preschoolers on a card sort task 
(Bohlmann & Fenson, 2005), middle-school students on a writing assignment (Gielen et al., 
2010), and undergraduates on a multiple-choice test of general knowledge (Butler & Roediger, 
2008). However, the effects of feedback are not universally beneficial (Mory, 2004). The goal of 
the current research is to better understand when feedback has positive effects and to identify 
features of feedback that may improve its efficacy. Specifically, we manipulated the presence 
and timing of feedback to experimentally test their impact on children’s mathematics learning. 

The Mixed Effects of Feedback 

In most learning contexts, the purpose of feedback is to provide information that the 
learner can use to confirm or modify prior knowledge. This feedback can promote the correction 
of errors (Kulhavy, 1977) and increase motivation (Mory, 2004). In many cases, feedback is 
helpful as intended and improves learning and performance. Indeed, meta-analyses consistently 
show that, on average, feedback has a positive effect relative to no feedback (Bangert-Drowns et 
al., 1991; Kluger & DeNisi, 1996). However, the impact of feedback varies, and under some 
circumstances, feedback can have neutral or negative effects (see Hattie & Gan, 2011). For 
example, negative effects have occurred for both right/wrong feedback (Pashler et al., 2005) and 
correct-answer feedback (Hays, Kornell, & Bjork, 2010) on adults’ word learning. 

Recent research suggests that learners who already have some knowledge in the domain 
are especially likely to experience negative effects of feedback. For example, in Fyfe, Rittle- 
Johnson, and DeCaro (2012), second- and third-grade children solved novel math problems. 
Some children received trial-by-trial right/wrong feedback, while others received no feedback. 



COMPUTER-GENERATED FEEDBACK 




For children with low prior knowledge on the pretest, feedback resulted in higher posttest scores 
than no feedback. However, for children with higher prior knowledge, feedback resulted in lower 
posttest scores than no feedback. Similar results were found when prior knowledge was 
manipulated via instruction on a correct strategy (Fyfe & Rittle-Johnson, 2015). Given that 
feedback can hinder learning under certain circumstances, more research is needed to understand 
when and why this occurs and to identify features of feedback that may improve its efficacy. 

The Timing of Feedback 

The timing of feedback may be one feature that influences the efficacy of feedback. 
Some researchers believe that feedback should be given immediately after a response in order to 
eliminate incorrect ways of thinking and reinforce correct ones (Skinner, 1954). Further, 
immediate feedback may provide motivation to practice, as progress can be easily monitored 
(Shute, 2008). However, others believe delaying feedback is more beneficial. First, it may 
prevent learners from becoming over-reliant on the immediate presentation of feedback, which in 
turn may increase the need to exert effort on one’s own response (Bangert-Drowns et al., 1991). 
Second, delaying feedback allows for the strength of initially incorrect responses to dissipate, 
which may make processing correct responses easier (Kulhavy, 1977). Finally, delaying 
feedback allows for spaced presentation of information (Butler, Karpicke, & Roediger, 2007). 

Several meta-analyses point to the advantages of immediate feedback, particularly in 
computer-based instruction (Azevedo & Bernard, 1995) and applied classroom studies (Kulik & 
Kulik, 1988). Indeed, multiple experiments have demonstrated the superiority of immediate 
feedback over delayed feedback for the acquisition of verbal materials (Beeson, 1973; Brosvic, 
Dihoff, Epstein, & Cook, 2006; Dihoff, Brosvic, & Epstein, 2003). However, a substantial 
body of research has found no difference between immediate and delayed feedback (e.g., Nakata, 
2015; Smits, Boon, Sluijsmans, & Van Gog, 2008) or significant advantages of delayed feedback 
(Bangert-Drowns et al., 1991; Butler et al., 2007; Butler & Roediger, 2008; Clariana et al., 2000; 
Kulhavy, 1977; Metcalfe, Kornell, & Finn, 2009; Smith & Kimball, 2010). For example, Butler 
and Roediger (2008) had undergraduate students study general knowledge passages and take a 
multiple-choice test. Feedback after each response resulted in a lower proportion of correct 
responses on a one-week posttest than delayed feedback after all test questions were completed. 

Recent work has found advantages of delaying feedback even after controlling for key 
confounds. For example, Metcalfe et al. (2009) found that delayed feedback was more effective 
than immediate feedback for sixth-grade students’ vocabulary learning after controlling for the 
shorter retention interval between delayed feedback and the time of testing (but see Nakata, 
2015). Recently, Mullet et al. (2014) found benefits of delayed feedback after controlling for 
time spent processing the feedback. Specifically, undergraduate engineering students had to view 
the feedback on their weekly homework assignments in order to get credit, regardless of when 
the feedback was provided. Students who received feedback one week after the assignments 
scored higher on a final exam than students who received feedback right after each assignment. 

Thus, in general the evidence on the timing of feedback is mixed, although more and 
more research is finding benefits of delayed feedback. However, there are still several gaps in the 
literature that need to be addressed. First, the majority of research is in the context of adults 
learning from multiple-choice tests and the effects of feedback on their recall of information. 
Although this work is informative, it may not generalize to children learning mathematics and 
their ability to generate a solution strategy to a class of novel problems. 

Second, there has been no systematic, experimental investigation of the timing of 
feedback in relation to learners’ prior knowledge. As mentioned above, prior knowledge often 
predicts whether learners benefit from the presence versus absence of feedback (e.g., Fyfe et al., 
2012; Gielen et al., 2010; Krause et al., 2009). Thus, it seems plausible that prior knowledge may 
also predict learning from immediate versus delayed feedback. Indeed, several researchers have 
suggested that low-knowledge learners may benefit from immediate feedback as they need to 
correct initial errors, but high-knowledge learners may benefit from delayed feedback as they 
need to process problems deeply with little intrusion (e.g., Gaynor, 1981). 

Third, little attention has been paid to how the timing of feedback impacts learners’ 
affective states. A leading theory suggests that feedback is less likely to be effective when it 
directs attention to the self as opposed to the task, because attention on the self can produce 
affective reactions that interfere with task-relevant processing (Kluger & DeNisi, 1996). For 
example, negative feedback can produce ego-threat, which may reduce one’s confidence or 
motivation to continue. One possibility is that immediate trial-by-trial feedback results in higher 
negative affect because it is provided during the learning task. An alternative possibility is that 
delayed, summative feedback results in higher negative affect because it draws a lot of attention 
to the self all at once, with no subsequent task on which to redeem one’s self-image. 

Current Study 

The goal of the current study was to address these gaps in the literature. Specifically, we 
manipulated the presence and timing of feedback to experimentally test their impact on 
mathematics learning for children with varying levels of prior knowledge. We based our study 
design on our previous work (Fyfe & Rittle-Johnson, 2015). In that study, children who received 
prior instruction on the math problems exhibited better learning on the posttest when they 
received practice without feedback than practice with feedback. Further, the negative effect of 
feedback did not depend on children’s prior knowledge on a pretest. Here, we used the same 
basic method and assessments. For example, the instruction script and practice problems were 
identical to those in the previous study, as were the pretest and posttest items. However, we 
made key changes to enhance the external validity of the results. 

First, the tutor’s presence was removed during problem solving and the feedback was 
provided solely by the computer. The use of computers and computer tutor systems in 
classrooms is increasing rapidly (Greaves & Hayes, 2008), and these systems often provide 
targeted practice with feedback. Indeed, the oft-cited benefit of computer programs is the ability 
to give immediate feedback on every student’s responses. Thus, it is imperative to understand 
how the presence and timing of computer-generated feedback affect mathematics learning. 
Second, the feedback in this study did not provide an explicit right/wrong judgment. It only 
contained the correct answer so as to focus students’ attention on the problem solution as 
opposed to whether the learner was right or wrong. Third, the posttest was administered the next 
day, rather than immediately following the task, so as to measure more stable knowledge change. 

In the experiment, we examined the impact of feedback for children learning to solve 
math equivalence problems (e.g., 3 + 4 + 5 = 3 + _), which require an understanding that both sides 
of an equation represent the same quantity. We hypothesized that feedback would result in 
higher posttest scores relative to no feedback, particularly for children with low prior knowledge. 

Method 

Participants 

Initial participants were 88 second-grade children from one public and one private 
school. Of these children, 77 met criteria for participation because they scored below 80% on a 
problem-solving screening measure. This ensured that children had room to learn from the 
intervention. Data from two children were excluded for failing to complete all activities. The 
final sample contained 75 children (mean age = 8.2 yrs, range = 7.4-9.2 yrs; 41 girls; 34 boys). 

Design 

The study had a between-subjects design with children randomly assigned to conditions: 
no feedback (n = 24), immediate feedback (n = 25), and summative feedback (n = 26). There were 
no significant differences between conditions in terms of age or gender, ps > .4. 

Materials 

Screening Measure. The screening measure comprised three tasks that tap understanding of 
math equivalence (from McNeil et al., 2011). For equation solving, children solved five math 
equivalence problems. For equation encoding, children reconstructed four math equivalence 
problems after viewing each for five seconds to assess how they mentally represented the 
structure of the problem. They received one point for each accurate reconstruction (up to four 
points). For defining the equal sign, children provided a written definition of the equal sign and 
received one point if they provided a relational definition (e.g., “the same amount”). 

Intervention Problems. The 12 intervention problems consisted mostly of four- and 
five-addend math equivalence problems with operations on both sides of the equal sign, with the 
unknown after the equal sign (e.g., 3 + 7 = _ + 6) or at the end (e.g., 5 + 3 + 9 = 5 + _). Three problems 
had an operation on the right side only (e.g., 9 = 6 + _). 

Self-Assessment. We obtained children’s subjective ratings of self-assessment to explore 
whether they considered their performance to reflect negatively on their traits. We adapted a 
measure used with kindergarten students in Kamins and Dweck (1999). Children responded to 
each of four items (e.g., “The problem-solving task made me feel like I was a smart student.”) on 
a 4-point scale ranging from strongly disagree to strongly agree. Scores on each item were 
averaged to form a single score out of four (α = .80). 

Posttest. The posttest, adapted from past work (Rittle-Johnson et al., 2011), was a 
broader measure that assessed equation-solving success. It included 8 items that assessed 
children’s use of correct strategies to solve math equivalence problems (α = .87). Half of the items 
were similar to those presented during the intervention, and half differed on a key problem 
feature, such as inclusion of subtraction (e.g., 5 + 6 - 3 = 5 + _). We administered several additional 
measures that were not informative for the current results and are thus not discussed further. 

Coding. We coded children’s problem-solving strategies on the screening measure, 
intervention problems, and posttest, based on their numerical answers (e.g., for the problem 
2 + 7 = 6 + _, an answer of 15 indicated an incorrect “add all” strategy and an answer of 3 indicated 
a correct strategy). Responses within one of the correct answer were coded as correct. To 
establish inter-rater reliability, a second rater (who was blind to the initial strategy code) coded 
children’s strategies on 30% of all problems. Inter-rater agreement was high (kappa = .98). 
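
The answer-based coding described above can be sketched in code. This is a minimal illustration and not the coding procedure used in the study: the function name `classify_answer`, its arguments, the strategy labels, and the order in which the rules are checked are all assumptions, and the sketch only covers addition problems with the blank on the right side of the equal sign.

```python
def classify_answer(left, right_known, answer, tolerance=1):
    """Classify a numerical answer to a problem such as 2 + 7 = 6 + _.

    left        -- addends on the left of the equal sign, e.g. [2, 7]
    right_known -- known addends on the right side, e.g. [6]
    answer      -- the child's numerical response
    tolerance   -- answers within this distance of the correct value
                   are coded as correct (the text uses "within one")
    """
    correct = sum(left) - sum(right_known)
    if abs(answer - correct) <= tolerance:
        return "correct"
    if answer == sum(left) + sum(right_known):
        return "add-all"        # added every number in the problem
    if answer == sum(left):
        return "add-to-equal"   # added only the numbers before the equal sign
    return "other"


print(classify_answer([2, 7], [6], 15))  # add-all
print(classify_answer([2, 7], [6], 3))   # correct
print(classify_answer([3, 7], [6], 10))  # add-to-equal
```

Checking the correct answer first is a design choice of this sketch: it means an answer that is within one of the correct value is never attributed to an incorrect strategy.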

Procedure 

Children completed the screening measure in their classrooms in a 10-minute session. 
Those who met the inclusion criteria then completed a one-on-one tutoring intervention in a 
single session lasting approximately 25 minutes. This session was conducted in a quiet area at 
the school with one of two trained tutors, and included strategy instruction and problem solving. 

Strategy Instruction. All children received instruction on a correct problem-solving 
strategy with four math equivalence problems with the blank at the end (e.g., 3 + 4 + 2 = 3 + _). 
Children were instructed on the commonly used equalize strategy, which involves adding the 
numbers on one side of the equal sign and then counting up from the number on the other side to 
get the same amount. Children were asked to answer questions to ensure they were attending to 
instruction. To check that the instruction worked, all children then solved a math equivalence 
problem on their own. If they solved the problem correctly, they proceeded to the next activity. If 
they solved it incorrectly, instruction on the equalize strategy was repeated without revealing the 
correct answer, and they were asked to solve another problem until they solved one correctly 
(although the children were not told this criterion). We set the protocol such that after three failed 
attempts the experiment was discontinued and children received more remedial tutoring. Two 
children were excluded for this reason, resulting in a sample of 73 children. 
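
The equalize strategy taught here can be written as a short procedure. The sketch below is only an illustration of the steps named in the text. The function name `equalize` and its arguments are hypothetical, and it assumes whole-number addition problems in which the left side sums to at least as much as the known right-side numbers.

```python
def equalize(left, right_known):
    """Return the number that fills the blank in: left = right_known + _."""
    target = sum(left)           # step 1: add the numbers on one side
    running = sum(right_known)   # step 2: start from the number on the other side
    counted = 0
    while running < target:      # step 3: count up until both sides are the same
        running += 1
        counted += 1
    return counted


print(equalize([3, 4, 2], [3]))  # 6, for the example problem 3 + 4 + 2 = 3 + _
```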

Problem Solving. Children were then asked to solve 12 math problems presented one at 
a time on a computer (see Materials for a description of the problems). First, the tutor described 
the task and whether or not the computer would provide feedback. Second, during the 
problem-solving task, the tutor’s presence was removed. Specifically, children were told they would work 
on the computer by themselves for this portion of the session so that they could work at their 
own pace and not worry about the tutor. Then, the tutor sat a short distance away and engaged in 
a different task (e.g., read a book) until the child had completed the problem-solving task. 

In the no-feedback condition, children did not receive feedback and were simply directed 
to click when they were ready for the next problem. In the immediate-feedback condition, 
children received trial-by-trial correct-answer feedback after each problem. In the summative- 
feedback condition, children received correct-answer feedback after all 12 problems had been 
solved. The problems with the child’s answers reappeared one at a time on the computer screen 
along with the feedback message. The problems and correct answers remained on the screen 
while the next problem appeared (up to four problems at a time), which allowed some 
spontaneous comparison across problems during the provision of feedback, much like a 
summative answer key. In both feedback conditions, the feedback was presented visually on the 
computer screen for each problem (e.g., “10 is the correct answer.”). Importantly, it did not 
contain an explicit right/wrong verification; it only contained the correct answer. Although the 
right/wrong judgment was implicit (via comparison of the child’s answer with the correct 
answer), there were no explicit signals from the computer (e.g., check mark, error noise, etc.). 

Following problem solving, children rated their self-assessment, and then returned to 
class. The following day, children completed the posttest. Due to absences, one child in the 
no-feedback condition completed the posttest 2 days later and two children in the 
immediate-feedback condition completed the posttest 4 and 5 days later. 

Data Analysis 

Children’s scores on most measures were analyzed using ANCOVAs. Scores that were 
not normally distributed were converted into a dichotomous outcome (i.e., 1 for success and 0 
otherwise) and logistic regression was used to predict the odds of success. In all models, 
condition was the independent variable. It was dummy coded with immediate feedback and 
summative feedback entered into the models, and no feedback as the reference group. Children’s 
age and score on the screening measure were entered as covariates. To test if the effects of 
condition depended on prior knowledge, the interaction between condition and screening 
measure was included. Thus, each model included two condition variables, age (mean centered), 
screening measure score (mean centered), and two condition by screening measure interactions. 
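
The predictor set described above can be made concrete with a small sketch that builds one row of predictors per child. It is an illustration under stated assumptions, not the authors' analysis script: the function name `design_matrix`, the dictionary keys, and the three example children are hypothetical.

```python
from statistics import mean


def design_matrix(children):
    """Build [intercept, immediate, summative, age_c, pk_c, imm*pk_c, sum*pk_c] rows."""
    mean_age = mean(c["age"] for c in children)
    mean_pk = mean(c["screening"] for c in children)
    rows = []
    for c in children:
        # Dummy codes: no feedback is the reference group (both codes are 0).
        immediate = 1 if c["condition"] == "immediate" else 0
        summative = 1 if c["condition"] == "summative" else 0
        age_c = c["age"] - mean_age          # mean-centered age
        pk_c = c["screening"] - mean_pk      # mean-centered screening score
        rows.append([1, immediate, summative, age_c, pk_c,
                     immediate * pk_c, summative * pk_c])
    return rows


# Hypothetical children, for illustration only.
children = [
    {"condition": "none", "age": 8.0, "screening": -1.0},
    {"condition": "immediate", "age": 8.2, "screening": 0.5},
    {"condition": "summative", "age": 8.4, "screening": 0.5},
]
rows = design_matrix(children)
print(len(rows[0]))  # 7 columns: intercept plus six predictors
```

With an intercept and six predictors, a sample of 73 children leaves 73 - 7 = 66 residual degrees of freedom, which matches the F(1, 66) tests reported in the Results.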

Results 

Screening Measure 

On average, children in the final sample (n = 73) solved 1.4 (SD = 1.3) problems correctly 
(out of 5), encoded 1.8 (SD = 1.4) problems correctly (out of 4), and 14% provided a relational 
definition of the equal sign. Performance did not differ by condition on any task, ps > .5. As in 
prior work (McNeil et al., 2011; Fyfe & Rittle-Johnson, 2015), we created a composite measure 
by summing z-scores across the three tasks (M = 0.00, SD = 2.09, range = -2.75 to 6.15). The 
composite score served as a prior knowledge covariate in subsequent analyses. 
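
A composite of this kind can be computed as follows. This is a minimal sketch with made-up scores: the function names and the example data are hypothetical, and the use of the sample standard deviation is an assumption, since the text does not say which form was used.

```python
from statistics import mean, stdev


def z_scores(values):
    """Standardize a list of scores (sample standard deviation assumed)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def composite(equation_solving, encoding, equal_sign):
    """Sum each child's z-scores across the three screening tasks."""
    columns = [z_scores(task) for task in (equation_solving, encoding, equal_sign)]
    return [sum(child) for child in zip(*columns)]


# Hypothetical scores for five children, for illustration only.
solving = [0, 1, 1, 2, 4]       # problems solved, out of 5
encoding = [1, 2, 0, 3, 4]      # problems reconstructed, out of 4
definition = [0, 0, 0, 1, 1]    # 1 = relational definition of the equal sign
scores = composite(solving, encoding, definition)
print([round(s, 2) for s in scores])
```

Because each task is standardized before summing, the composite has a mean of zero by construction, consistent with the reported M = 0.00.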

Intervention Problems 

We analyzed performance on the intervention problems to examine how feedback 
impacted the on-going task. Across all 12 problems the frequency of correct strategy use was 
similar in the immediate-feedback (M = 80%, SD = 19%), summative-feedback (M = 78%, SD 
= 29%), and no-feedback (M = 73%, SD = 29%) conditions. An ANCOVA revealed no main effect 
of immediate-feedback, p = .16, or summative-feedback, p = .14, relative to no-feedback. There 
was no summative-feedback by prior knowledge interaction, p = .40; but, there was a marginal 
immediate-feedback by prior knowledge interaction, F(1, 66) = 3.83, p = .055, ηp² = .06. 

To follow up the interaction, we tested the effect of immediate-feedback for children with 
lower and higher prior knowledge. Screening measure scores were centered at one standard 
deviation below the mean in one ANCOVA and one standard deviation above the mean in a 
second ANCOVA (see Aiken & West, 1991). For children with low prior knowledge, there was 
a main effect of immediate-feedback, F(1, 66) = 5.50, p = .02, ηp² = .08. The model estimates a 
~20% differential between immediate- and no-feedback (M = 71% vs. 50%). But, for children 
with high prior knowledge, there was no main effect of immediate-feedback, F(1, 66) = 0.11, p = 
.74, ηp² = .00. The model estimates a ~3% differential favoring no-feedback (M = 89% vs. 92%). 
Thus, immediate trial-by-trial feedback boosted the problem-solving performance of children 
with low prior knowledge, but had minimal impact for children with higher prior knowledge. 
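
The re-centering step used in this follow-up can be written out explicitly. As a sketch under the model described in the Data Analysis section, let I and S be the immediate- and summative-feedback dummy codes, A the centered age, and K the centered screening composite (these symbol names are introduced here only for illustration):

```latex
\begin{aligned}
Y &= b_0 + b_1 I + b_2 S + b_3 A + b_4 K + b_5 (I \cdot K) + b_6 (S \cdot K) + \varepsilon \\[4pt]
E[Y \mid I{=}1, S{=}0, K{=}k] - E[Y \mid I{=}0, S{=}0, K{=}k] &= b_1 + b_5 k \\[4pt]
K' = K - k \;\Rightarrow\; Y &= (b_0 + b_4 k) + (b_1 + b_5 k)\, I + (b_2 + b_6 k)\, S + b_3 A \\
&\quad + b_4 K' + b_5 (I \cdot K') + b_6 (S \cdot K') + \varepsilon
\end{aligned}
```

Setting k to one standard deviation below (or above) the mean of K therefore makes the coefficient on I in the re-centered model equal to the estimated effect of immediate feedback for lower- (or higher-) knowledge children, which is the quantity tested in each follow-up ANCOVA.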

Descriptively, a trial-by-trial plot of intervention performance suggests differences were 
strongest on the last few items (see Figure 1). Indeed, if we split the intervention problems into 
three blocks (first four problems, middle four problems, and last four problems), significant 
effects only emerged on the last block. On these last four problems, the raw frequency of correct 
strategy use was high in the immediate-feedback condition (M = 88%, SD = 18%), but lower in the 
summative- (M = 77%, SD = 35%), and no-feedback (M = 71%, SD = 33%) conditions. There was a 
significant main effect of immediate-feedback, F(1, 66) = 6.87, p = .01, ηp² = .10, and no effect 
of summative-feedback, p = .13. Again, a significant immediate-feedback by prior knowledge 
interaction, F(1, 66) = 6.16, p = .02, ηp² = .09, indicated that the positive effect of immediate 
feedback was present for low-knowledge, but not high-knowledge, children. Immediate feedback 
seemed to help children maintain strong performance on difficult problems. For example, 
children who did not receive feedback during the task scored higher on the first, simpler block of 
problems than on the last, more difficult block of problems (M = 80% vs. 74%). However, 
children who received trial-by-trial feedback maintained high scores from the first block to the 
last block (M = 85% vs. 88%), despite the increase in problem difficulty. 

Although most children used a correct strategy on the majority of problems, their errors 
were informative for examining the types of incorrect strategies they employed. Across all 
children, the most common incorrect strategies were to add the numbers before the equal sign 
(11% of all trials) and to add all the numbers in the problem (5% of all trials). For example, for 
the problem 3 + 7 = _ + 6, an answer of 10 indicates an add-before-equal sign strategy and an 
answer of 16 indicates an add-all-numbers strategy. Children in the immediate-feedback 
condition used these two incorrect strategies less often (M = 10%, SD = 27%) than children in the 
summative- (M = 17%, SD = 26%) and no-feedback conditions (M = 21%, SD = 27%). An ANCOVA 
confirmed this, revealing a main effect of immediate-feedback, F(1, 66) = 4.21, p = .04, ηp² = .06, 
no main effect of summative-feedback, p = .19, and no interactions with prior knowledge, 
ps > .10. Overall, these two strategies accounted for 61% of all the errors made. This was 
particularly true for children who did not receive feedback during the task (accounting for 68% 
of errors in summative- and no-feedback, but only 46% of errors in immediate-feedback). 


Figure 1. Trial-by-trial performance on each intervention problem by condition. 


Self-Assessment 

Children’s ratings of positive self-assessment after the problem-solving task were similar 
in the immediate-feedback (M = 3.5 out of 4, SD = 0.4) and no-feedback (M = 3.4, SD = 0.4) 
conditions, but lower in the summative-feedback condition (M = 3.1, SD = 0.5). An ANCOVA 
revealed no main effect of immediate-feedback, p = .73, but a significant main effect of 
summative-feedback relative to no-feedback, F(1, 66) = 4.33, p = .04, ηp² = .06. There were no 
condition-by-prior knowledge interactions, ps > .14. A follow-up analysis revealed a significant 
difference between the two feedback types, F(1, 66) = 6.08, p = .02, ηp² = .09. 

Across all children, ratings of self-assessment were not correlated with performance on 
the intervention problems, r(71) = .14, p = .26, or on the posttest, r(71) = .10, p = .41. Looking 
by condition, the correlations were negligible in the no-feedback condition (intervention 
problems, r(22) = .08, p = .72, and posttest, r(22) = .01, p = .95) and in the immediate-feedback 
condition (intervention, r(22) = -.08, p = .72, and posttest, r(22) = -.09, p = .69). However, the 
correlations were moderate, though non-significant, in the summative-feedback condition 
(intervention problems, r(23) = .30, p = .14, and posttest, r(23) = .30, p = .14). This suggests the 
ratings of self may have played a larger role when the feedback was provided all at once. 

Posttest 

On the next-day posttest, percent correct was highest with immediate-feedback (M = 86%, 
SD = 22%), lower with summative-feedback (M = 78%, SD = 27%), and lowest with no-feedback 
(M = 65%, SD = 38%). An ANCOVA revealed main effects of immediate-feedback, F(1, 66) = 
10.02, p = .002, ηp² = .13, and summative-feedback, F(1, 66) = 5.06, p = .03, ηp² = .07, relative to 
no-feedback. However, prior knowledge interacted with both immediate-feedback, F(1, 66) = 
5.78, p = .02, ηp² = .08, and summative-feedback, F(1, 66) = 4.56, p = .04, ηp² = .07. 

To follow up the interactions, two ANCOVAs tested the effect of condition for children 
with lower and higher prior knowledge (see Figure 2). For children with low prior knowledge, 
there were large effects of both immediate-feedback, F(1, 66) = 14.61, p < .001, ηp² = .18, and 
summative-feedback, F(1, 66) = 9.46, p = .003, ηp² = .13. The model estimates a ~35-45% 
differential between the feedback conditions and no-feedback (see Figure 2). But, for children 
with high prior knowledge, there were no main effects of immediate-feedback, F(1, 66) = 0.39, p 
= .53, ηp² = .01, or summative-feedback, F(1, 66) = 0.00, p = .99, ηp² = .00. The model estimates 
a 0-5% differential between the feedback conditions and no feedback (see Figure 2). 



Note. Estimated scores are plotted at plus/minus one standard deviation from the mean. 


However, scores on the posttest were high and not normally distributed as the majority of 
children (75%) solved more than half of the items correctly with a full 47% of children solving 
all of the items correctly. We used logistic regression to predict the log of the odds of scoring 
100% correct. More children in the immediate-feedback condition (63%) scored 100% on the 
posttest relative to the summative-feedback (40%) and no-feedback (38%) conditions. There was 
a main effect of immediate-feedback, β = 1.14, p = .03, OR = 5.72, but no effect of 
summative-feedback, β = 0.49, p = .48, OR = 1.63, relative to no-feedback. A follow-up analysis revealed a 
marginal difference between the two feedback types, β = 1.25, p = .08, OR = 3.50. Prior 
knowledge did not interact with immediate-feedback or summative-feedback, ps > .10. 

The results are similar, though less robust, if we use less stringent criteria. For example, 
in a logistic regression predicting the log of the odds of scoring 75% or higher, there is a 
marginal effect of immediate-feedback, β = 1.79, p = .08, OR = 6.01, no effect of 
summative-feedback, β = 0.51, p = .49, OR = 1.66, and no interactions with prior knowledge, ps > .15. 
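
For readers less familiar with logistic regression, an odds ratio is the exponential of the corresponding log-odds coefficient. The sketch below illustrates that conversion using two of the coefficients reported above (0.49 for summative feedback and 1.25 for the comparison of the two feedback types). The function name `odds_ratio` is hypothetical, and the results agree with the reported odds ratios only up to rounding of the coefficients.

```python
import math


def odds_ratio(log_odds_coefficient):
    """Convert a logistic-regression coefficient (log odds) to an odds ratio."""
    return math.exp(log_odds_coefficient)


for coefficient in (0.49, 1.25):
    print(coefficient, round(odds_ratio(coefficient), 2))
```

An odds ratio above 1 means the odds of reaching the criterion score are higher in the named condition than in the comparison condition.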

Together, these results suggest that immediate and summative feedback improve posttest 
scores relative to no feedback for low-knowledge learners. However, only immediate feedback 
improves mastery of the material, and this is true for both low- and high-knowledge learners. To 
better understand these condition effects, we further explored performance in two ways. First, we 
split the posttest into two subscales: learning and transfer (see Table 1). On the learning items, 
scores were similar with immediate- (M = 88%, SD = 19%) and summative-feedback (M = 85%, 
SD = 27%), and both were higher than no-feedback (M = 72%, SD = 35%). But, on the transfer 
items, scores were highest with immediate-feedback (M = 83%, SD = 29%), lower with 
summative-feedback (M = 70%, SD = 36%), and lowest with no-feedback (M = 58%, SD = 
45%). Indeed, for learning scores, there were significant main effects of both 
immediate-feedback, F(1, 66) = 6.52, p = .01, ηp² = .09, and summative-feedback, F(1, 66) = 5.60, p = .02, 
ηp² = .08. For transfer scores, the main effect was significant for immediate-feedback, F(1, 66) = 
8.01, p = .006, ηp² = .11, but not for summative-feedback, F(1, 66) = 2.39, p = .13, ηp² = .04. 

Table 1 

Posttest Learning Subscale (α = .72)          Posttest Transfer Subscale (α = .86) 
Structurally similar to problems presented    Contained novel feature: new position of the 
during the intervention                       blank or included subtraction 

(illegible)                                   _ + 2 = 6 + 4 
3 + 4 = _ + 5                                 8 + _ = 8 + 6 + 4 
3 + 7 + 6 = _ + 6                             5 + 6 - 3 = 5 + _ 
7 + 6 + 4 = 7 + _                             5 - 2 + 4 = _ + 4 


Second, we compared the distribution of scores on the intervention problem-solving task 
and on the posttest (see Figure 3). Descriptively, an interesting pattern emerged such that the 
biggest change occurred in the immediate-feedback condition. Immediate feedback seemed to 
push children who exhibited moderate performance during the intervention toward levels of 
mastery on the posttest. In contrast, for the summative-feedback condition, the distribution of 
scores was nearly identical at intervention and posttest. There was some change in the 
no-feedback condition such that the extremes were more common (relatively low or relatively high 
performance) at posttest. These patterns suggest that children who did not receive feedback 
during the intervention task either got it or not, and this was reflected on the posttest as well. 
However, children who received immediate trial-by-trial feedback were able to learn during the 
task and exhibit their better understanding of the problems on the next-day posttest. 

Figure 3. Distribution of scores by condition at intervention and at posttest. 

Discussion 

The results from the current study show positive effects of computer-generated feedback 
on mathematics learning for children who received prior instruction on the problems. During the 
tutoring session, immediate feedback boosted problem-solving performance and decreased the 
use of common incorrect strategies for low-knowledge children. Immediate feedback did not 
impact ratings of self-assessment, but summative feedback led to lower ratings of 
self-assessment. On the next-day posttest, both immediate and summative feedback resulted in higher 
equation-solving success than no feedback for low-knowledge children. However, only 
immediate feedback facilitated mastery for both low- and high-knowledge children. 

Despite increasing evidence in favor of delaying the presentation of feedback, the present 
findings indicate that immediate feedback may be more effective for promoting children’s 
mathematics problem solving (see also Brosvic et al., 2006). In addition to facilitating learning, 
immediate feedback facilitated transfer to novel problems and mastery of the material. Several 
pieces of evidence suggest that these benefits may be attributable to the trial-by-trial nature of 
the feedback and children’s opportunity to use the feedback on subsequent problems. First, 
immediate feedback improved performance on the last block of training problems, but not the 
first block. This suggests children were able to learn from the initial feedback and maintain high 
performance on the later, more difficult problems. Second, immediate feedback reduced the use 
of the incorrect ‘add-all’ and ‘add-to-equal’ strategies. That is, when children in the 
immediate-feedback condition made errors, they were less likely to use common incorrect strategies that 
stem from entrenched misconceptions (McNeil & Alibali, 2005) and more likely to try 
something different. Third, only immediate feedback resulted in a big change in the distribution 
of children at mastery. This suggests a progression of knowledge such that children started out 
variable, learned from the feedback, and were able to exhibit mastery by the posttest. This is in 
contrast to the no-feedback and summative-feedback groups, which were more stable. 

Indeed, with one exception, children in the summative-feedback condition performed 
similarly to children in the no-feedback condition, indicating that summative feedback had 
minimal benefits. One reason may have been the lack of an opportunity to use the feedback right 
away. Unlike immediate feedback, there was no subsequent problem on which to try a new 
strategy until the next-day posttest. Another reason may have been increased attention on the 
self. Summative feedback led to lower ratings of positive self-assessment relative to no feedback 
and immediate feedback. Further, these ratings in the summative-feedback condition were 
moderately and negatively correlated with posttest performance (although these correlations did not 
reach standard levels of significance). Thus, it seems possible that receiving feedback on all of 
the problems at once increased attention on the self, produced a negative affective response (e.g., 
ego-threat), and influenced children’s general confidence or motivation to learn. 

Although these two reasons may play a role in explaining why summative feedback was 
minimally beneficial, there are other possibilities related to the design of the current experiment. 
First, the summative feedback may have been less visually and cognitively overwhelming if each 
problem had been presented individually (as opposed to remaining on the screen). Second, 
summative feedback may have been more effective given a longer retention interval between the 
instructional session and posttest. Indeed, Butler et al. (2007) found that the benefits of delayed 
feedback over immediate feedback emerged on a one-week posttest, but not on a one-day 
posttest. Further, experiments with motor tasks have consistently found that while immediate 
feedback results in more efficient learning, delayed feedback results in improved retention and 
fewer errors on subsequent assessments (Schmidt & Bjork, 1992). Third, summative feedback in 
the current study was still somewhat immediate in the sense that it was given right after the 
practice task. Several studies have found advantages to providing summative feedback the next 
day rather than after the task (e.g., Bardwell, 1981; Butler et al., 2007; but see Dihoff et al., 
2004). More generally, future research needs to parse the differences between trial-by-trial 
versus summative feedback and between immediate versus delayed feedback. Often, “delayed” feedback 
is defined relative to the conditions used in a particular experiment, making it difficult to compare across studies. 

Future research should also continue to examine how feedback relates to learners’ prior 
knowledge. The current findings are consistent with previous work demonstrating that feedback 
often has stronger, positive effects for children with low prior knowledge on a pretest than 
for children with higher prior knowledge. For example, feedback on middle-school students’ writing 
assignments improved performance for learners with low prior knowledge on a pretest, but not 
for learners with higher prior knowledge (Gielen et al., 2010). Similarly, undergraduate students 
with high prior knowledge of statistics performed just as well on a posttest whether they received 
correct-answer feedback during training or not (Krause et al., 2009). Indeed, the biggest benefits 
of feedback in this study occurred for low-knowledge children during training and at posttest. 

However, these results are slightly at odds with our previous study. In Fyfe and 
Rittle-Johnson (2015), children who received prior instruction exhibited better learning without 
feedback. Further, this negative effect of feedback did not depend on children’s prior knowledge 
on the pretest. Here, children who received prior instruction exhibited better learning with 
feedback, though this was more pronounced for children with low prior knowledge. Why were 
the effects of feedback positive in this study? One possibility is that feedback is more effective in 
environments that reduce attention on the self. There are three critical differences between the 
current study and the previous study: (1) the presence of the human tutor during problem solving 
was removed and feedback was provided solely by the computer, (2) the feedback message only 
contained the correct answer; it did not contain an explicit right/wrong judgment, and (3) the 
posttest was administered the following day as opposed to immediately after the learning task. 
The first two changes reduce attention to self and evaluation, while the third change allows for 
any effects that may arise from attention on the self to dissipate prior to testing. These differing 
results are potentially consistent with the hypothesis that feedback will lead to larger gains when 
it draws attention to the task as opposed to the self (Kluger & DeNisi, 1996), though future 
experimental research is needed to test the causal nature of this claim. 

In conclusion, the present study provides experimental evidence for the benefits of 
computer-generated feedback on children’s mathematics learning. Specifically, for children 
given instruction on the target problems, both immediate and summative feedback resulted in 
higher performance on a next-day posttest relative to a no-feedback control for children with 
lower prior knowledge. Further, immediate feedback increased transfer and mastery of the 
material for both low- and high-knowledge children. These results reinforce the notion that even 
minimal feedback on a computer can be a powerful form of guidance during problem solving. 